Interlingua-Based Broad-Coverage Korean-to-English Translation in CCLINC

Authors

  • Young-Suk Lee
  • Wu Sok Yi
  • Stephanie Seneff
  • Clifford J. Weinstein
Abstract

At MIT Lincoln Laboratory, we have been developing a Korean-to-English machine translation system, CCLINC (Common Coalition Language System at Lincoln Laboratory). The CCLINC Korean-to-English translation system consists of two core modules, a language understanding module and a generation module, mediated by a language-neutral meaning representation called a semantic frame. The key features of the system include: (i) Robust, efficient parsing of Korean (a verb-final language with overt case markers, relatively free word order, and frequent omissions of arguments). (ii) High-quality translation via word sense disambiguation and accurate word order generation of the target language. (iii) Rapid system development and porting to new domains via knowledge-based automated acquisition of grammars. Having been trained on Korean newspaper articles on “missiles” and “chemical biological warfare,” the system produces translation output sufficient for content understanding of the original document.

1. SYSTEM OVERVIEW

The CCLINC Korean-to-English translation system is a component of the CCLINC Translingual Information System, the focus languages of which are English and Korean, [11,17]. The structure of the Translingual Information System is given in Figure 1. Given the input text or speech, the language understanding system parses the input and transforms the parsing output into a language-neutral meaning representation called a semantic frame, [16,17]. The semantic frame, the key properties of which will be discussed in Section 2.3, becomes the input to the generation system. The generation system produces the target language translation output after word order arrangement, vocabulary replacement, and the appropriate surface form realization in the target language, [6]. Besides serving as the input to the generation system, the semantic frame can be utilized for other applications such as translingual information extraction and question-answering, [12].
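The understanding-to-generation flow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from CCLINC: the `SemanticFrame` class, the `understand`, `generate`, and `translate` functions, the romanized toy tokens, the marker-to-role table, and the three-word lexicon are all hypothetical. It only shows the general idea of reading roles off case markers in a verb-final input, storing them in a language-neutral frame, and then applying target-language word order and vocabulary replacement.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical toy lexicon for vocabulary replacement (not from CCLINC).
LEXICON: Dict[str, str] = {
    "bukhan": "North Korea",
    "misail": "a missile",
    "balsahaetda": "launched",
}

# Hypothetical mapping from romanized case markers to frame roles.
ROLE_BY_MARKER: Dict[str, str] = {
    "i": "agent", "ga": "agent",      # subject markers
    "eul": "theme", "reul": "theme",  # object markers
}


@dataclass
class SemanticFrame:
    """Toy language-neutral frame: one predicate plus named roles."""
    predicate: str
    roles: Dict[str, str] = field(default_factory=dict)


def understand(tokens: List[str]) -> SemanticFrame:
    """Build a frame from a verb-final, case-marked token list.

    Roles come from the case marker, not from position, so argument
    order is free and omitted arguments simply leave a role unfilled.
    """
    if not tokens:
        raise ValueError("empty input")
    *arguments, verb = tokens  # the verb is assumed to come last
    frame = SemanticFrame(predicate=verb)
    for token in arguments:
        stem, _, marker = token.rpartition("-")
        role = ROLE_BY_MARKER.get(marker)
        if stem and role:
            frame.roles[role] = stem
    return frame


def generate(frame: SemanticFrame) -> str:
    """Realize the frame in English subject-verb-object order."""
    order = ("agent", "predicate", "theme")
    slots = dict(frame.roles, predicate=frame.predicate)
    # Unknown words pass through unchanged; missing roles are skipped.
    words = [LEXICON.get(slots[s], slots[s]) for s in order if s in slots]
    return " ".join(words)


def translate(tokens: List[str]) -> str:
    """Understanding followed by generation, mediated by the frame."""
    return generate(understand(tokens))


print(translate(["bukhan-i", "misail-eul", "balsahaetda"]))
print(translate(["misail-eul", "bukhan-i", "balsahaetda"]))
```

Both calls produce the same English string, because the frame, not the surface order of the input, drives generation. A real system would need a full grammar, morphological analysis, and word sense disambiguation, none of which this sketch attempts.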
In this paper, we focus on the Korean-to-English text translation component of CCLINC.

[Figure 1 diagram labels: English text or speech; Korean text or speech; other languages; understanding; generation; semantic frames (Common Coalition Language); C4I information access]

Figure 1. CCLINC Translingual Information System Structure

2. ROBUST PARSING, MEANING REPRESENTATION, AND AUTOMATED GRAMMAR ACQUISITION

* This work was sponsored by the Defense Advanced Research Projects Agency under the contract number F19628-00-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the United States Air Force.

1 For other approaches to Korean-to-English translation, the readers are referred to Egedi, Palmer, Park and Joshi 1994, a transfer-based approach using synchronous tree adjoining grammar, [5], and Dorr 1997, a small-scale interlingua-based approach using Jackendoff’s lexical conceptual structure as the interlingua, [4].


Related resources

An interlingua based on domain actions for machine translation of task-oriented dialogues

This paper describes an interlingua for spoken language translation that is based on domain actions in the travel planning domain. Domain actions are composed of speech acts (e.g., request-information), attributes (e.g., size, price), and objects (e.g., hotel, flight) and can take arguments. Development of the interlingua is guided by a database containing travel dialogues in English, Korean, Ja...



Using Danish as a CG Interlingua: A Wide-Coverage Norwegian-English Machine Translation System

This paper presents a rule-based Norwegian-English MT system. Exploiting the closeness of Norwegian and Danish, and the existence of a well-performing Danish-English system, Danish is used as an «interlingua». Structural analysis and polysemy resolution are based on Constraint Grammar (CG) function tags and dependency structures. We describe the semiautomatic construction of the necessary Norwe...


Ontology Based Interlingua Translation

In this paper we describe an interlingua translation system from Italian to Italian Sign Language. The main components of this system are a broad-coverage dependency parser, an ontology-based semantic interpreter, and a grammar-based generator; we describe the main features of these components.


Multi-lingual Sentence Generation from the PIVOT Interlingua

This paper proposes a strategy for French and Spanish sentence generation systems, based on the English generation system. The English generation model consists of four procedures: conceptual wording (sentence-structure planning), syntactic selection, ordering, and morphological generation. The analysis of linguistic similarities and differences between English, French and Spanish reveals that a...




Publication date: 2001